Transparent Persistence with Java Data Objects

Julius Hrivnac
LAL, Orsay, France 

A flexible and performant Persistency Service is a necessary component of any HEP Software
Framework. Building a modular, non-intrusive and performant persistency component has been
shown to be a very difficult task. In the past, it was very often necessary to sacrifice modularity to
achieve acceptable performance. This resulted in a strong dependency of the overall Frameworks
on their Persistency subsystems.

Recent developments in software technology have made it possible to build a Persistency Service
which can be used transparently from other Frameworks. Such a Service doesn't impose strong
architectural constraints on the overall Framework Architecture, while satisfying high performance
requirements. The Java Data Objects standard (JDO) has already been implemented for almost all
major databases. It provides truly transparent persistency for any Java object (both internal and
external). Objects in other languages can be handled via transparent proxies. Being only a thin
layer on top of the underlying database, JDO doesn't introduce any significant performance
degradation. In addition, Aspect-Oriented Programming (AOP) makes it possible to treat persistency
as an orthogonal Aspect of the Application Framework, without polluting it with
persistence-specific concepts.

All these techniques have been developed primarily (or only) for the Java environment. It is,
however, possible to interface them transparently to Frameworks built in other languages, such
as C++.

Fully functional prototypes of flexible and non-intrusive persistency modules have been built
for several other packages, for example FreeHEP AIDA and LCG Pool AttributeSet (package
Indicium).



I. JDO 

A. Requirements on Transparent Persistence 

The Java Data Objects (JDO) [1], [2], [3], [4] standard
has been created to satisfy several requirements on
object persistence in Java:

• Object Model independence from persistency:

— Java types are automatically mapped to 
native storage types. 

— 3rd party objects can be persistified (even 
when their source is not available). 

— The source of the persistent class is the 
same as the source of the transient class. 
No additional code is needed to make a 
class persistent. 

— All classes can be made persistent (where
that makes sense).

• Illusion of in-memory access to data: 

— Dirty instances (i.e. objects which have 
been changed after they have been read) 
are implicitly updated in the database. 

— Caching, synchronization, retrieval and
lazy loading are done automatically.

— All objects referenced from a persistent
object are automatically persistent
(Persistence by Reachability).



• Portability across technologies: 

— A wide range of storage technologies 
(relational databases, object-oriented 
databases, files,...) can be transparently 
used. 

— All JDO implementations are interchangeable.

• Portability across platforms is automati- 
cally available in Java. 

• No need for a different language (DDL,
SQL, ...) to handle persistency (incl. queries).

• Interoperability with Application Servers
(EJB [20], ...).
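The idea of Persistence by Reachability can be illustrated without any JDO library. The following is only a minimal sketch under stated assumptions, not the algorithm used by any JDO implementation: it walks an object graph by reflection and collects every application object reachable from a root, which is the set a persistence layer would have to store. The classes `Event` and `Track` and the helper `reachableFrom` are hypothetical names introduced for this illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class ReachabilitySketch {

    // Hypothetical data classes, used only for illustration.
    public static class Track {
        public double pt;
        public Track(double pt) { this.pt = pt; }
    }

    public static class Event {
        public int number;
        public List<Track> tracks = new ArrayList<>();
        public Event(int number) { this.number = number; }
    }

    // True for classes that belong to the JDK rather than to the application.
    private static boolean isJdkClass(Class<?> c) {
        return c.getName().startsWith("java.");
    }

    // Returns all application objects reachable from root (root included).
    public static Set<Object> reachableFrom(Object root) {
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<Object> found = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> todo = new ArrayDeque<>();
        if (root != null) todo.push(root);
        while (!todo.isEmpty()) {
            Object o = todo.pop();
            if (!visited.add(o)) continue;              // already seen, also stops cycles
            if (o instanceof Iterable) {                // collections: follow their elements
                for (Object e : (Iterable<?>) o) {
                    if (e != null) todo.push(e);
                }
                continue;
            }
            if (o.getClass().isArray() || isJdkClass(o.getClass())) continue; // treated as leaves
            found.add(o);
            for (Class<?> c = o.getClass(); c != null && !isJdkClass(c); c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);
                        Object value = f.get(o);
                        if (value != null) todo.push(value);
                    } catch (IllegalAccessException ex) {
                        throw new IllegalStateException(ex);
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Event event = new Event(1);
        event.tracks.add(new Track(25.0));
        event.tracks.add(new Track(5.0));
        System.out.println("Objects to persist: " + reachableFrom(event).size());
    }
}
```

Making only the `Event` persistent is enough in this picture: its `Track` objects are found through the reference held by the `Event`.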



B. Architecture of Java Data Objects 

The Java Data Objects standard (Java Community
Process Open Standard JSR-12) [3] has been created
to satisfy the requirements listed in the previous
section.

The persistence capability is added to a class by the
Enhancer (as shown in Figure 1):

• The Enhancer makes a transient class PersistenceCapable
by adding to it all data and methods needed
to provide the persistence functionality. After
enhancement, the class implements the PersistenceCapable
interface (as shown in Figure 2).

FIG. 1: JDO Enhancement.



FIG. 2: Enhancer makes any class PersistenceCapable.
(Methods shown for the PersistenceCapable interface:
jdoIsPersistent(), jdoIsNew(), jdoIsDeleted(),
jdoIsTransactional(), jdoIsDirty(), jdoMakeDirty(),
jdoGetPersistenceManager(), getObjectId().)



• The Enhancer is generally applied to a class file, but
it can also be part of a compiler or a loader.

• The effects of enhancement can be modified via the
Persistence Descriptor (XML file).

• All enhancers are compatible. Classes enhanced 
with one JDO implementation will work auto- 
matically with all other implementations. 
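What enhancement adds can be pictured with a hand-written equivalent. The sketch below is an illustration only: a real Enhancer modifies the class file of the original class and adds the members defined by the JDO standard, whereas here a subclass is used so that the effect is visible in plain source code. The names `DirtyTracking` and `EnhancedTrack` are hypothetical and do not belong to JDO.

```java
public class EnhancementSketch {

    // The class as the user writes it: no persistence-specific code.
    public static class Track {
        protected double pt;
        public double getPt() { return pt; }
        public void setPt(double pt) { this.pt = pt; }
    }

    // Hypothetical stand-in for the bookkeeping that enhancement provides.
    public interface DirtyTracking {
        boolean isDirty();
        void clearDirty();
    }

    // Hand-written picture of an enhanced Track: every modification of a
    // field is intercepted and recorded, so that a persistence layer can
    // later write back only the instances which have really changed.
    public static class EnhancedTrack extends Track implements DirtyTracking {
        private boolean dirty = false;

        @Override
        public void setPt(double pt) {
            if (pt != this.pt) {
                dirty = true;
            }
            super.setPt(pt);
        }

        @Override
        public boolean isDirty() { return dirty; }

        @Override
        public void clearDirty() { dirty = false; }
    }

    public static void main(String[] args) {
        EnhancedTrack track = new EnhancedTrack();
        track.setPt(20.0);
        System.out.println("dirty after modification: " + track.isDirty());
        track.clearDirty();   // what a commit would do after writing the object
        System.out.println("dirty after commit: " + track.isDirty());
    }
}
```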

The main object a user interacts with is the
PersistenceManager. It mediates all interactions with the
database, manages the lifecycle of instances and serves
as a factory for Transactions, Queries and Extents (as
shown in Figure 3).



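FIG. 3: All interactions with JDO are mediated by
PersistenceManager. (Roles shown: QueryFactory with
newQuery() producing Query; ExtentFactory with
getExtent(Class pc, boolean subclasses) producing Extent;
TransactionFactory with currentTransaction() producing
Transaction.)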



C. Available Implementations 

About a year after the JDO standardization, many
implementations are already available, supporting all
widely used storage technologies.



D. JDO Implementations 

1. Commercial JDO Implementations 

The following commercial implementations of the JDO
standard exist:

enJin (Versant), FastObjects (Poet), FrontierSuite
(ObjectFrontier), IntelliBO (Signsoft), JDOGenie
(Hemisphere), JRelay (Object Industries),
KODO (SolarMetric), LiDO (LIBeLIS), OpenFusion
(Prism), Orient (Orient), PE:J (HYWY), ...

These implementations often have a free community
license available.



2. Open JDO Implementations 

There are already several open JDO implementa- 
tions available: 

• JDORI [5] (Sun) is the reference and standard
implementation. It currently only works
with FOStore files. Support for relational
databases via JDBC is under development.
It is the most standard, but not the
most performant implementation.

• TJDO [6] (SourceForge) is a high quality
implementation originally written by the TriActive
company and later released under the GPL license.
It supports all important relational databases
and the automatic creation of the database
schema. It implements the full JDO standard.

• XORM [7] (SourceForge) does not yet support
the full JDO standard. It does not automatically
generate a database schema; on the other hand,
it allows the reuse of existing schemas.

• JORM [8] (JOnAS/ObjectWeb) has a fully
functional object-relational mapping; the full
JDO implementation is under development.

• OJB [9] (Apache) has a mature object-relational
engine. A full JDO interface is not yet
provided.



E. Supported Databases 

All widely used databases are already supported, either
by their provider or by a third party:

• RDBMS and ODBMS: Oracle, MS SQL Server,
DB2, PointBase, Cloudscape, MS Access,
JDBC/ODBC Bridge, Sybase, Interbase,
InstantDB, Informix, SAP DB, PostgreSQL, MySQL,
Hypersonic SQL, Versant, ...

• Files: XML, FOStore, flat, C-ISAM, ...

The performance of JDO implementations is determined
by the native performance of the underlying database.
JDO itself introduces only a very small overhead.



II. HEP APPLICATIONS USING JDO 

A. Trivial Application 

A simple application using JDO to write and read
data is shown in Listing I.

B. Indicium 

Indicium [10] has been created to satisfy the
LCG [22] Pool [23] requirements on Metadata
management: "To define, accumulate, search,
filter and manage Attributes (Metadata)
external/additional to existing (Event) data." Those
metadata are a generalization of the traditional Paw ntuple
concept. They are used in the first phase of the analysis
process to make a pre-selection of Events for further
processing. They should be efficient. They are evidently
closely related to Collections (of Events).

The Indicium package provides an implementation 
of the AttributeSet (Event Metadata, Tags) for the 
LCG/Pool project in Java and C++ (with the same
API). The core of Indicium is implemented in Java.

All the expressed requirements can only be well satisfied
by a system which allows, in principle, any object to
act as an AttributeSet. Such a system can be easily
built once we realize that the mentioned requirements are
satisfied by JDO:

• AttributeSet is simply any Object with a ref- 
erence to another (Event) Object. 

• Explicit Collection is just any standard Java 
Collection. 

• Implicit Collection (i.e. all objects of some 
type T within a Database) is directly the JDO 
Extent. 
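The three points above can be sketched in plain Java. This is an illustration under stated assumptions rather than Indicium code: `Event`, `EventTag` and `preselect` are hypothetical names, and an in-memory `List` stands in for the Explicit Collection. With JDO, the Implicit Collection would be obtained from the Extent instead of being built by hand.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class AttributeSetSketch {

    // Hypothetical (large) Event object; only its number is kept here.
    public static class Event {
        public final int number;
        public Event(int number) { this.number = number; }
    }

    // An AttributeSet is simply an object carrying a few attributes
    // together with a reference to the Event which it describes.
    public static class EventTag {
        public final double missingEt;
        public final int nTracks;
        public final Event event;

        public EventTag(double missingEt, int nTracks, Event event) {
            this.missingEt = missingEt;
            this.nTracks = nTracks;
            this.event = event;
        }
    }

    // Pre-selection: only the small tags are inspected, and the
    // references of the selected tags lead to the Events to be processed.
    public static List<Event> preselect(Collection<EventTag> tags, double minMissingEt) {
        List<Event> selected = new ArrayList<>();
        for (EventTag tag : tags) {
            if (tag.missingEt > minMissingEt) {
                selected.add(tag.event);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Explicit Collection: any standard Java Collection.
        List<EventTag> tags = new ArrayList<>();
        tags.add(new EventTag(35.0, 12, new Event(1)));
        tags.add(new EventTag(8.0, 40, new Event(2)));
        tags.add(new EventTag(52.5, 7, new Event(3)));
        for (Event event : preselect(tags, 20.0)) {
            System.out.println("selected Event " + event.number);
        }
    }
}
```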

Indicium works with any JDO/DB implementation.
As all the requirements are directly satisfied
by JDO itself, Indicium only implements a
simple wrapper and code for database management
(database creation, opening, ...). That is in fact the
only database-specific code.

It is easy to switch between various JDO/DB im- 
plementations via a simple properties file. The de- 
fault Indicium implementation contains configuration 
for JDORI with FOStore file format and TJDO with 
Cloudscape or MySQL databases, others are simple to 
add. 
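Such a switch can be sketched as follows. The property keys are the standard JDO ones; the factory class names and connection URLs are illustrative assumptions, not the actual Indicium configuration, and should be checked against the documentation of the chosen implementation. The helper names `load` and `factoryClass` are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class JdoConfigSketch {

    // Assumed content of a properties file selecting JDORI with FOStore.
    public static final String JDORI_FOSTORE =
        "javax.jdo.PersistenceManagerFactoryClass=com.sun.jdori.fostore.FOStorePMF\n" +
        "javax.jdo.option.ConnectionURL=fostore:database/indicium\n";

    // Assumed content of a properties file selecting TJDO with MySQL.
    public static final String TJDO_MYSQL =
        "javax.jdo.PersistenceManagerFactoryClass=com.triactive.jdo.PersistenceManagerFactoryImpl\n" +
        "javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver\n" +
        "javax.jdo.option.ConnectionURL=jdbc:mysql://localhost/indicium\n";

    // Parses text in the java.util.Properties file format.
    public static Properties load(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return properties;
    }

    // The only place where the chosen implementation becomes visible.
    public static String factoryClass(Properties properties) {
        return properties.getProperty("javax.jdo.PersistenceManagerFactoryClass");
    }

    public static void main(String[] args) {
        // With a JDO library on the classpath, the Properties object would be
        // passed to JDOHelper.getPersistenceManagerFactory(properties).
        System.out.println(factoryClass(load(JDORI_FOSTORE)));
        System.out.println(factoryClass(load(TJDO_MYSQL)));
    }
}
```

The application code stays the same in both cases; only the content of the properties file changes.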

The data stored by Indicium are accessible also via 
native database protocols (like JDBC or SQL) and 
tools using them. 

As already mentioned, Indicium provides
just a simple convenience layer on top of JDO,
trying to capture standard AttributeSet usage
patterns. There are four ways in which an AttributeSet
can be defined:
    // Initialization
    PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);
    PersistenceManager pm = pmf.getPersistenceManager();
    Transaction tx = pm.currentTransaction();

    // Writing
    tx.begin();
    Event event = ...;
    pm.makePersistent(event);
    tx.commit();

    // Searching using Java-like query language translated internally to DB native query language
    // (SQL available too for RDBS)
    tx.begin();
    Extent extent = pm.getExtent(Track.class, true);
    String filter = "pt > 20.0";
    Query query = pm.newQuery(extent, filter);
    Collection results = (Collection) query.execute();  // execute() returns Object
    tx.commit();

Listing I: Trivial example of using JDO.



• Assembled AttributeSet is fully constructed at 
run-time in a way similar to classical Paw ntu- 
ples. 

• Generated AttributeSet class is generated 
from a simple XML specification. 

• Implementing AttributeSet can be written by 
hand to implement the standard AttributeSet 
Interface. 

• FreeStyle AttributeSet can be just about any 
class. It can be managed by the Indicium in- 
frastructure, only some convenience functional- 
ity may be lost. 
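The Assembled variant, in which the shape of the AttributeSet is known only at run-time, can be pictured with the following sketch. The class names `Signature` and `AssembledAttributeSet` are borrowed from the CIndicium listing, but their behavior here (a name-to-type map with a type check on every assignment) is an assumption made for illustration and is not the actual Indicium implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AssembledAttributeSetSketch {

    // Run-time description of the attributes: name -> Java type.
    public static class Signature {
        private final String name;
        private final Map<String, Class<?>> types = new LinkedHashMap<>();

        public Signature(String name) { this.name = name; }

        public Signature add(String attribute, Class<?> type) {
            types.put(attribute, type);
            return this;
        }

        public String getName() { return name; }

        public Class<?> typeOf(String attribute) { return types.get(attribute); }
    }

    // Ntuple-like row whose columns are fixed by a Signature.
    public static class AssembledAttributeSet {
        private final Signature signature;
        private final Map<String, Object> values = new LinkedHashMap<>();

        public AssembledAttributeSet(Signature signature) { this.signature = signature; }

        // Accepts a value only if the attribute exists and the type matches.
        public void set(String attribute, Object value) {
            Class<?> type = signature.typeOf(attribute);
            if (type == null) {
                throw new IllegalArgumentException("Unknown attribute: " + attribute);
            }
            if (!type.isInstance(value)) {
                throw new IllegalArgumentException(
                    "Attribute " + attribute + " requires " + type.getSimpleName());
            }
            values.put(attribute, value);
        }

        public Object get(String attribute) { return values.get(attribute); }
    }

    public static void main(String[] args) {
        Signature signature = new Signature("AssembledClass")
            .add("j", Integer.class)
            .add("y", Double.class)
            .add("s", String.class);
        AssembledAttributeSet as = new AssembledAttributeSet(signature);
        as.set("j", 7);
        as.set("y", 0.75);
        as.set("s", "some text");
        System.out.println(as.get("j") + " " + as.get("y") + " " + as.get("s"));
    }
}
```

The Generated and Implementing variants trade this run-time flexibility for compile-time type checking, since their attributes are ordinary Java fields.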

To also satisfy the requirements of C++ users, the
C++ interface of Indicium has been created in the
form of JACE [14] proxies. This way, C++ users can
directly use Indicium Java classes from a C++ program.
The CIndicium Architecture is shown in Figure 4;
an example of its use is shown in Listing II.

C. AIDA Persistence

JDO has been used to provide a basic persistency
service for the FreeHEP [12] reference implementation
of AIDA [11]. Three kinds of extension to the existing
implementation have been required:



• Implementation of the IStore interface as
AidaJDOStore.

• Creation of the XML description for each AIDA
class (for example see Listing III).

• Several small changes to existing classes, like
creation of wrappers around arrays of primitive
types, etc.

It has become clear that the AIDA persistence API
is not sufficient and has to be made richer to allow
more control over persistent objects, better searching
capabilities, etc.

D. Minerva 

Minerva [13] is a lightweight Java Framework which
implements the main Architectural principles of the
ATLAS C++ Framework Athena [21]:

• Algorithm - Data Separation: Algorithmic
code is separated from the data on which it
operates. Algorithms can be explicitly called and
don't have a persistent state (except for
parameters). Data are potentially persistent and
processed by Algorithms.

• Persistent - Transient Separation: The
Persistency mechanism is implemented by specified
components and has no impact on the definition
of the transient Interfaces. Low-level
Persistence technologies can be replaced without
changing the other Framework components (except
for possible configuration). A specific definition
of Transient-Persistent mapping is possible,
but is not required.

• Implementation Independence: There are
no implementation-specific constructs in the
definition of the interfaces. In particular, all
Interfaces are defined in an implementation-independent
way. Also, all public objects (i.e. all objects
which are exchanged between components
and which subsequently appear in the Interfaces'
definitions) are identifiable by
implementation-independent Identifiers.

• Modularity: All components are explicitly
designed with interchangeability in mind. This
implies that the main deliverables are simple
and precisely defined general interfaces, and the
existing implementations of the various modules serve
mainly as Reference implementations.

FIG. 4: CIndicium - C++ interface to Indicium.

    // Construct Signature
    Signature signature("AssembledClass");
    signature.add("j", "int",    "SomeIntegerNumber");
    signature.add("y", "double", "SomeDoubleNumber");
    signature.add("s", "String", "SomeString");

    // Obtain Accessor to database
    Accessor accessor = AccessorFactory::createAccessor("MyDB.properties");

    // Create Collection
    accessor.createCollection("MyCollection", signature, true);

    // Write AttributeSets into database
    AssembledAttributeSet* as;
    for (int i = 0; i < 100; i++) {
      as = new AssembledAttributeSet(signature);
      as->set("j", ...);
      as->set("y", ...);
      as->set("s", ...);
      accessor.write(*as);
    }

    // Search database
    std::string filter = "y > 0.5";
    Query query = accessor.newQuery(filter);
    Collection collection = query.execute();
    std::cout << "First: " << collection.toArray()[0].toString() << std::endl;

Listing II: Example of CIndicium use.

    <jdo>
      <package name="hep.aida.ref.histogram">
        <class name="Histogram2D"
               persistence-capable-superclass="hep.aida.ref.histogram.Histogram">
        </class>
      </package>
    </jdo>

Listing III: Example of JDO descriptor for AIDA class.

Minerva scheduling is based on the InfoBus [16]
Architecture:

• Algorithms are Data Producers or Data Con- 
sumers (or both). 

• Algorithms declare their supported I/O types.

• Scheduling is done implicitly. An Algorithm 
runs when it has all its inputs ready. 

• Both Algorithms and Services run as (static or 
dynamic) Servers. 

• The environment is naturally multi-threaded. 
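The implicit scheduling rule ("an Algorithm runs when it has all its inputs ready") can be illustrated with a small sketch. It is not Minerva code and does not use the InfoBus API: it is a single-threaded approximation in which data are identified by type names only, and the names `Algorithm` and `schedule` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class DataDrivenSchedulerSketch {

    // An Algorithm declares the data types it consumes and produces.
    public static class Algorithm {
        public final String name;
        public final Set<String> inputs;
        public final Set<String> outputs;

        public Algorithm(String name, List<String> inputs, List<String> outputs) {
            this.name = name;
            this.inputs = new HashSet<>(inputs);
            this.outputs = new HashSet<>(outputs);
        }
    }

    // Runs every Algorithm once, as soon as all its inputs are available,
    // and returns the names in the order of execution. Algorithms whose
    // inputs never become available are not run.
    public static List<String> schedule(List<Algorithm> algorithms) {
        Set<String> available = new HashSet<>();
        List<Algorithm> pending = new ArrayList<>(algorithms);
        List<String> order = new ArrayList<>();
        boolean progress = true;
        while (progress && !pending.isEmpty()) {
            progress = false;
            Iterator<Algorithm> it = pending.iterator();
            while (it.hasNext()) {
                Algorithm algorithm = it.next();
                if (available.containsAll(algorithm.inputs)) {
                    order.add(algorithm.name);            // "run" the Algorithm
                    available.addAll(algorithm.outputs);  // publish its products
                    it.remove();
                    progress = true;
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // The declaration order does not matter; the data flow decides.
        List<Algorithm> algorithms = Arrays.asList(
            new Algorithm("Writer", Arrays.asList("ProcessedEvent"), Collections.<String>emptyList()),
            new Algorithm("Processor", Arrays.asList("Event"), Arrays.asList("ProcessedEvent")),
            new Algorithm("Reader", Collections.<String>emptyList(), Arrays.asList("Event")));
        System.out.println(schedule(algorithms));
    }
}
```

No Algorithm refers to another one by name; the order of execution follows only from the declared inputs and outputs, which is what keeps the components interchangeable.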

An overview of the Minerva Architecture is shown in
Figure 5.

It is very easy to configure and run Minerva. For
example, one can create a Minerva run with 5 parallel
Servers. Two of them read Events from
two independent databases, one processes each
Event and the last two write the newly processed Events to
two new databases, depending on the Event
characteristics. (See Figure 6 for a schema of such a run and
Listing IV for its steering script.)

    new Algorithm(<Algorithm properties>);
    new ObjectOutput(<db3>, <Event properties1>);
    new ObjectOutput(<db4>, <Event properties2>);
    new ObjectInput(<db1>);
    new ObjectInput(<db2>);

Listing IV: Example of steering script for a Minerva run.

Minerva also has a simple but powerful modular
Graphical User Interface, which makes it easy to plug in
other components such as the BeanShell [15] command-line
interface, the JAS [17] histogramming, the
ObjectBrowser [18], etc. Figure 7 and Figure 8 show
examples of running Minerva with various interactive
plugins loaded.

III. PROTOTYPES USING JDO 

A. Object Evolution 

It is often necessary to change an object's shape while
keeping its content and identity. This functionality
is especially needed in the persistency domain to satisfy
Schema Evolution (Versioning) or Object Mapping
(DB Projection), i.e. retrieving an Object of type A
dressed as an Object of another type B. This
functionality is not addressed by JDO. In practice, it is
handled either on the lower level (in a database) or
on the higher level (in the overall framework, for
example EJB).

It is, however, possible to implement Object Evolution
for JDO with the help of Dynamic Proxies and
Aspects.

FIG. 5: Minerva is based on the InfoBus scheduling and the JDO persistency.

FIG. 6: Example of a Minerva run.

Let's suppose that a user wants to read an Object of
type A (with Interface IA) dressed as an Object of
another Interface IB. To enable that, four components
should co-operate (as shown in Fig. 9):

• The JDO Enhancer enhances class A so that it is
PersistenceCapable and managed by the JDO
PersistenceManager.

• AspectJ [19] adds a read-callback with the
mapping A → IB. This is called automatically when
JDO reads an object A.

• A simple database of mappers provides a
suitable mapping between A and IB.

• DynamicProxy delivers the content of the
Object A with the interface IB:
IB b = (IB)DynamicProxy.newInstance(A, IB);

All those manipulations are of course hidden from the
End User.
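The proxy step can be sketched with the standard `java.lang.reflect.Proxy` class. This is a minimal sketch, assuming that a mapper is registered per method of IB; the read-callback installed by AspectJ and the JDO part are not shown. The names `Mapper` and `dressAsIB`, and the example attributes `px`, `py` and `pt`, are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class ObjectEvolutionSketch {

    // Old shape of the object, as it is stored.
    public interface IA {
        double getPx();
        double getPy();
    }

    public static class A implements IA {
        private final double px;
        private final double py;
        public A(double px, double py) { this.px = px; this.py = py; }
        public double getPx() { return px; }
        public double getPy() { return py; }
    }

    // New shape, as the user wants to see it.
    public interface IB {
        double getPt();
    }

    // A mapper computes the result of one IB method from the stored A.
    public interface Mapper {
        Object map(A source);
    }

    // Delivers the content of an A behind the interface IB.
    public static IB dressAsIB(final A source, final Map<String, Mapper> mappers) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (method.getDeclaringClass() == Object.class) {
                    return method.invoke(source, args);   // toString(), hashCode(), equals()
                }
                Mapper mapper = mappers.get(method.getName());
                if (mapper == null) {
                    throw new UnsupportedOperationException("No mapping for " + method.getName());
                }
                return mapper.map(source);
            }
        };
        return (IB) Proxy.newProxyInstance(
            IB.class.getClassLoader(), new Class<?>[] { IB.class }, handler);
    }

    public static void main(String[] args) {
        // The "database of mappers" is reduced here to a map keyed by method name.
        Map<String, Mapper> mappers = new HashMap<>();
        mappers.put("getPt", a -> Math.hypot(a.getPx(), a.getPy()));
        IB b = dressAsIB(new A(3.0, 4.0), mappers);
        System.out.println("pt = " + b.getPt());
    }
}
```

The stored object keeps its old shape and identity; only the view handed to the user changes, so no data have to be rewritten when the interface evolves.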



B. Foreign References 

HEP data are often stored in sets of independent
databases, each one managed independently. Such
architectures do not directly support references between
objects from different databases (while references
inside one database are managed directly by the JDO
support for Persistence by Reachability). As in the
case of Object Evolution, foreign references are
usually resolved either on the lower level (i.e. all
databases are managed by one storage manager and
JDO operates on top) or on the higher level (for
example by the EJB framework).

Another possibility is to use a similar Architec- 
ture as in the case of Object Evolution with Dynamic 
Proxy delivering foreign Objects. 

FIG. 7: Running set of Producers and Consumers created from a script inside Minerva.

FIG. 8: Using ObjectBrowser to inspect Algorithm inside Minerva.

FIG. 9: Support for Object Evolution.

Let's suppose that a User reads an object A, which
contains a reference to another object B, which is
actually stored in a different database (and thus managed
by a different PersistenceManager). In this case, the
database with the object A doesn't in fact contain
an object B, but a DynamicProxy object. The
object B can be transparently retrieved using three
co-operating components (as shown in Fig. 10):

• When the reference from an object A to an object
B is requested, JDO delivers a DynamicProxy
instead.

• The DynamicProxy asks the PersistenceManagerFactory
for a PersistenceManager which handles
the object B. It then uses that PersistenceManager
to get the object B and casts itself into it.

• The PersistenceManagerFactory gives this
information by interrogating a DBcatalog (possibly a Grid
Service).

All those manipulations are of course hidden from the
End User.
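The lazy resolution of a foreign reference can again be sketched with `java.lang.reflect.Proxy`. The sketch below makes several simplifying assumptions: a `Map` plays the role of the DBcatalog, the hypothetical class `Database` stands in for a PersistenceManager bound to one database, and objects are addressed by a string identifier. None of these names come from JDO.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class ForeignReferenceSketch {

    // Interface of the referenced object B.
    public interface Vertex {
        double getZ();
    }

    public static class VertexImpl implements Vertex {
        private final double z;
        public VertexImpl(double z) { this.z = z; }
        public double getZ() { return z; }
    }

    // Stand-in for one database together with its manager.
    public static class Database {
        private final Map<String, Object> objects = new HashMap<>();
        private int reads = 0;

        public void store(String id, Object object) { objects.put(id, object); }

        public Object read(String id) {
            reads++;
            return objects.get(id);
        }

        public int getReads() { return reads; }
    }

    // Creates a proxy which is stored in place of the foreign object. The
    // real object is looked up through the catalog on the first method call.
    public static <T> T foreignReference(Class<T> type, Map<String, Database> catalog,
                                         String databaseName, String objectId) {
        InvocationHandler handler = new InvocationHandler() {
            private Object target;

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    Database database = catalog.get(databaseName);
                    if (database == null) {
                        throw new IllegalStateException("Unknown database: " + databaseName);
                    }
                    target = database.read(objectId);
                    if (target == null) {
                        throw new IllegalStateException("Unknown object: " + objectId);
                    }
                }
                return method.invoke(target, args);
            }
        };
        return type.cast(Proxy.newProxyInstance(
            type.getClassLoader(), new Class<?>[] { type }, handler));
    }

    public static void main(String[] args) {
        Database vertices = new Database();
        vertices.store("v1", new VertexImpl(1.5));
        Map<String, Database> catalog = new HashMap<>();
        catalog.put("vertexDB", vertices);

        Vertex vertex = foreignReference(Vertex.class, catalog, "vertexDB", "v1");
        System.out.println("reads before use: " + vertices.getReads());
        System.out.println("z = " + vertex.getZ());
        System.out.println("reads after use: " + vertices.getReads());
    }
}
```

The second database is contacted only when the reference is actually used, and the code holding the reference sees an ordinary `Vertex`.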
FIG. 10: Support for Foreign References.



IV. SUMMARY

It has been shown that the JDO standard provides a
suitable foundation of the persistence service for HEP
applications.

Two major characteristics of persistence solutions
based on JDO are:

• Non-intrusiveness.

• Wide range of available JDO implementations,
both commercial and free, giving access to all
major databases.

JDO profits from the native database functionality
and performance (SQL queries, ...), but presents it to
users in a native Java API.


[1] More detailed talk about JDO:
http://hrivnac.home.cern.ch/hrivnac/Activities/2002/June/JDO
[2] More detailed talk about JDO:
http://hrivnac.home.cern.ch/hrivnac/Activities/2002/November/Indicium
[3] Java Data Objects Standard:
http://java.sun.com/products/jdo
[4] Java Data Objects Portal:
http://www.jdocentral.com
[5] JDO Reference Implementation (JDORI):
http://access1.sun.com/jdo
[6] TJDO:
http://tjdo.sourceforge.net
[7] XORM:
http://xorm.sourceforge.net
[8] JORM:
http://jorm.objectweb.org
[9] OJB:
http://db.apache.org/ojb/
[10] Indicium:
http://hrivnac.home.cern.ch/hrivnac/Activities/Packages/Indicium
[11] AIDA:
http://aida.freehep.org
[12] FreeHEP Library:
http://java.freehep.org
[13] Minerva:
http://hrivnac.home.cern.ch/hrivnac/Activities/Packages/Minerva
[14] JACE:
http://reyelts.dyndns.org:8080/jace/release/docs/index.html
[15] Lightweight Scripting for Java (BeanShell):
http://www.beanshell.org
[16] InfoBus:
http://java.sun.com/products/javabeans/infobus/
[17] Java Analysis Studio (JAS):
http://jas.freehep.org
[18] Object Browser:
http://hrivnac.home.cern.ch/hrivnac/Activities/Packages/ObjectBrowser/
[19] AspectJ:
http://www.eclipse.org/aspectj/
[20] Enterprise Java Beans (EJB):
http://java.sun.com/products/ejb
[21] ATLAS C++ Framework (Athena):
http://atlas.web.cern.ch/ATLAS/GROUPS/SOFTWARE/OO/architecture/General/index.html
[22] LCG Computing Grid Project (LCG):
http://wenaus.home.cern.ch/wenaus/peb-app
[23] LCG Persistency Framework (Pool):
http://lcgapp.cern.ch/project/persist